Mingyi Hong

Faster Synchronous On-Policy RL via Straggler-Aware Group Sizing

Jun 01, 2026

EMA-Nesterov: Stabilizing Nesterov's Lookahead for Accelerated Deep Learning Optimization

May 25, 2026

Rethinking Muon Beyond Pretraining: Spectral Failures and High-Pass Remedies for VLA and RLVR

May 19, 2026

Subspace Control: Turning Constrained Model Steering into Controllable Spectral Optimization

Apr 05, 2026

AgentDS Technical Report: Benchmarking the Future of Human-AI Collaboration in Domain-Specific Data Science

Mar 19, 2026

StitchCUDA: An Automated Multi-Agents End-to-End GPU Programing Framework with Rubric-based Agentic Reinforcement Learning

Mar 03, 2026

Powering Up Zeroth-Order Training via Subspace Gradient Orthogonalization

Feb 19, 2026

HiPER: Hierarchical Reinforcement Learning with Explicit Credit Assignment for Large Language Model Agents

Feb 18, 2026

DISPO: Enhancing Training Efficiency and Stability in Reinforcement Learning for Large Language Model Mathematical Reasoning

Feb 01, 2026

Scaling Unverifiable Rewards: A Case Study on Visual Insights

Dec 27, 2025